Replacing the Soft FEC Limit Paradigm in the
Design of Optical Communication Systems 

Alex Alvarado, Erik Agrell, Domaniç Lavery, Robert Maher, and Polina Bayvel


Abstract: The FEC limit paradigm is the prevalent practice
for designing optical communication systems to attain a certain
bit-error rate (BER) without forward error correction (FEC).
This practice assumes that there is an FEC code that will
reduce the BER after decoding to the desired level. In this
paper, we challenge this practice and show that the concept of
a channel-independent FEC limit is invalid for soft-decision
bit-wise decoding. It is shown that for low code rates and
high-order modulation formats, the use of the soft FEC limit
paradigm can underestimate the spectral efficiencies by up to
20%. A better predictor for the BER after decoding is the
generalized mutual information, which is shown to give
consistent post-FEC BER predictions across different channel
conditions and modulation formats. Extensive optical full-field
simulations and experiments are carried out in both the linear
and nonlinear transmission regimes to confirm the theoretical
analysis.


I. Introduction and Motivation 

Forward error correction (FEC) and multilevel modulation
formats are key technologies for realizing high spectral
efficiencies in optical communications. The combination of FEC
and multilevel modulation is known as coded modulation
(CM), where FEC is used to recover the sensitivity loss
from the nonbinary modulation. While in the past optical
communication systems were based on hard-decision (HD)
FEC, modern systems use soft-decision FEC (SD-FEC).

Current digital coherent receivers are based on powerful 
digital signal processing (DSP) algorithms, which are used 
to detect the transmitted bits and to compensate for channel 
impairments and transceiver imperfections. The optimal DSP 
should find the most likely coded sequence. However, this 
is hard to realize in practice, and thus, most receivers are 
implemented suboptimally. In particular, detection and FEC
decoding are typically decoupled at the receiver: soft
information on the code bits is calculated first, and then, an
SD-FEC decoder is used. We refer to this receiver structure as
a bit-wise (BW) decoder, also known in the literature as a
bit-interleaved coded modulation (BICM) receiver [1], [2], owing
its name to the original works [3], [4], where a bit-level
interleaver was included between the FEC encoder and mapper. In
the context of optical communications, BW decoders have been
studied,
e.g., in E-Hni. 

An alternative to BW decoders is to use iterative demapping 
(ID) and decoding, i.e., when the FEC decoder and demapper
exchange soft information on the code bits iteratively. This 
structure is known as BICM-ID and was introduced in HD- 
113. BICM-ID for optical communications has been studied 
in m-Ha, Ol sec. 3], ini sec. 3], US Sec. 4]. Due to the 
inherent simplicity of the (noniterative) BW receiver structure, 
BICM-ID is not considered in this paper. 

For simplicity, researchers working on optical communications
typically use offline DSP. In this case, and to meet higher-
layer quality of service requirements, the bit-error rate (BER)
after FEC decoding (in this paper referred to as post-FEC
BER or BERpost) should be as low as $10^{-12}$ or $10^{-15}$.
Since such low BER values cannot be reliably estimated by
Monte-Carlo simulations, the conventional design strategy has
been to simulate the system without FEC encoding and decoding,
and optimize it for a much higher BER value, the so-called "FEC
limit" or "FEC threshold". The rationale for this approach,
which we call the FEC limit paradigm, is that a certain
BER without coding (here referred to as pre-FEC BER or
BERpre) supposedly can be reduced to the desired post-FEC
BER by previously verified FEC implementations.

The use of FEC limits assumes that the decoder's performance
is fully characterized by BERpre, and that different
channels with the same BERpre will result in the same BERpost
using a given FEC code. Under some assumptions on independent
bit errors (which can be achieved by interleaving the code
bits), this assumption is justifiable if the decoder is based on
HDs. This is the case for HD-FEC, where the decoder is fed
with bits modeled using a binary symmetric channel (BSC).
The use of FEC limits, however, has not changed with the
adoption of SD-FEC in optical communications, which has
made the "SD-FEC limit" increasingly popular in
the optical communications literature.

The application of SD-FEC in optical communications dates
back to the pioneering experiments by Puc et al. in 1999
[19], who used a concatenation of a Reed-Solomon code
and a convolutional code. Other early studies of SD include
block turbo codes [20], [21] and low-density parity-check
(LDPC) codes [22]–[24]. Another concatenated code suitable
for SD decoding was defined for optical submarine systems
by the ITU in the G.975.1 standard [25]. See [26], [27], and
references therein for further details on SD-FEC in optical
communications.

Tables and plots of BERpost vs. BERpre were presented in, 
e.g., [19], [24], Ga, under specific choices for the channel,
modulation format, and symbol rate. Although this was
not suggested when these tables and plots were originally
published, the existence of such data has subsequently been
adopted to avoid the need for including FEC in system
simulations and experiments. This SD-FEC limit paradigm
is nowadays very popular in optical communication system
design. It has been used, for example, in the record experiments
based on 2048 quadrature amplitude modulation (QAM) for
single-core [28] and multi-core [29] fibers. It has, however,
never been validated to what extent the function BERpost
vs. BERpre, determined for one set of system parameters
(channel, modulation, symbol rate, etc.), accurately
characterizes the same function with other parameters.

Another option to predict the post-FEC BER is to use the
mutual information (MI) between the input and output of the
discrete-time channel. This approach was suggested in [30]–[32]
and applied to optical communications in [33]. In [33],
it was shown that the MI is a better metric than the pre-FEC
BER in predicting the post-FEC BER, which casts significant
doubts on the SD-FEC limit paradigm.

This paper investigates the usage of the generalized mutual 
information (GMI) [1, Sec. 3], [2, Sec. 4.3] for the same
purpose. The GMI, also known as the BICM capacity (or 
parallel decoding capacity), was introduced in an optical 
communications context in Q. The performance of some 
LDPC codes with four-dimensional constellations over the 
additive white Gaussian noise (AWGN) channel was evaluated 
in terms of the GMI in [34]. With any given LDPC code,
an apparent one-to-one mapping was observed between the
GMI and the post-FEC BER, regardless of the constellation
used. In this paper, which extends the conference version
[35], we investigate this mapping further and show that the
GMI is a very accurate post-FEC BER predictor, significantly
more accurate than both the pre-FEC BER and the MI, under
general conditions.^1 Consistent results were obtained for the
nonlinear optical channel in both linear and nonlinear regimes,
for the AWGN channel, for both LDPC codes and turbo codes,
for a variety of modulation formats, and also validated by
experiments.

This paper is organized as follows. In Sec. II, the system
model is introduced and principles for FEC are reviewed.
Sec. III introduces achievable rates, which are quantified by
the MI and GMI. The post-FEC BER prediction is studied in
Sec. IV. Conclusions are drawn in Sec. V.

II. Preliminaries 
A. Channel and System Model 

In this paper, we consider the CM transceiver shown in
Fig. 1, which is common for coherent optical communication
systems. Data is transmitted in blocks of 2n symbols,
where every block represents n time instants in each of the
two polarizations. At the transmitter, an outer encoder is

^1 One of these conditions is that the binary code under consideration is
universal, i.e., that its performance does not depend on the distribution of
the soft information passed to the decoder, but only on the capacity of the
channel [36, Sec. 9.5]. The universality property of LDPC codes for binary-
input memoryless channels was initially discussed in [32], [37], later studied
in, e.g., [38], [39], and recently for spatially-coupled LDPC codes in [40].


TABLE I

Summary of system parameters used in WDM simulation.

Parameter                     Value
----------------------------  ----------------
Fiber attenuation             0.2 dB/km
Dispersion parameter          17 ps/nm/km
Fiber nonlinear coefficient   1.2 (W km)^-1
Span length                   L km
PMD                           0 ps/√km
Symbol rate                   32 Gbaud
EDFA noise figure             3 dB
WDM channels                  11
Channel separation            50 GHz
Pulse shape                   RRC, 1% rolloff


serially concatenated with an inner FEC encoder with code
rate $R_c$. The inner encoder generates code bits
$\boldsymbol{C}_1^p, \ldots, \boldsymbol{C}_m^p$, where
$\boldsymbol{C}_k^p = [C_{k,1}^p, \ldots, C_{k,n}^p]$, $k = 1, 2, \ldots, m$ is the
bit position and $p \in \{\mathsf{x}, \mathsf{y}\}$ indicates the polarization.^2 The
code bits for each polarization are fed to a memoryless M-ary
QAM (MQAM) mapper with $M = 2^m$ constellation points
$\mathcal{X} = \{x_1, x_2, \ldots, x_M\}$. We consider Gray-mapped square
QAM constellations with M = 4, 16, 64, 256 as well as (non-
Gray) 8QAM from [41, Fig. 14(a)].

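The transmitted sequence of complex symbols
$\boldsymbol{X}^p = [X_1^p, X_2^p, \ldots, X_n^p]$ with $X_l^p \in \mathcal{X}$ is modulated using a
root-raised-cosine (RRC) pulse with 1% rolloff. The symbols in
the two polarizations are combined into the matrix

\[
\underline{X} =
\begin{bmatrix}
X_1^{\mathsf{x}} & X_2^{\mathsf{x}} & \cdots & X_n^{\mathsf{x}} \\
X_1^{\mathsf{y}} & X_2^{\mathsf{y}} & \cdots & X_n^{\mathsf{y}}
\end{bmatrix}
\tag{1}
\]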


and sent through a nonlinear optical channel, whose parameters
are summarized in Table I. We consider 11 dual-polarization
wavelength-division multiplexed (WDM) channels of 32 Gbaud in
a 50 GHz grid over a single span of single-mode fiber (SMF) of
length L with zero polarization mode dispersion (PMD). At the
receiver, an erbium-doped fiber amplifier (EDFA) with an ideal
noise figure of 3 dB (spontaneous emission factor equal to 1)
is used. The digital signal processing (DSP) in the receiver
includes electronic chromatic dispersion compensation (EDC)
and matched filtering followed by ideal data-aided phase
compensation.^3 Data for the central channel is recorded and
represented (for the two polarizations) by the received matrix
$\underline{Y}$ of size 2 by n, where $Y_l^p \in \mathbb{C}$ for
$l = 1, 2, \ldots, n$ and $p \in \{\mathsf{x}, \mathsf{y}\}$.

As shown in Fig. 1, the optical channel is modeled by the
channel law $f_{\underline{Y}|\underline{X}}(\underline{y}|\underline{x})$.^4 This discrete-time model
encompasses all the transmitter DSP used after the MQAM mapper
(i.e., pulse shaping and polarization multiplexing), the physical
channel (the fiber and the EDFA), and the receiver DSP.

Even though some residual intersymbol interference usually 
remains after EDC and the received symbols are affected 

^2 Throughout this paper, boldface symbols denote random vectors.

^3 In our ideal phase compensation algorithm, the nonlinear phase noise of
each received symbol is compensated by multiplying the received symbol
by $\exp(-j\theta_i)$ with $i = 1, \ldots, M$, where $\theta_i$ is the average phase rotation
experienced by all the received symbols Y such that $X = x_i$.

^4 Throughout this paper, $f_A(a)$ denotes a probability density function (PDF)
and $f_{A|B}(a|b)$ a conditional PDF. Similarly, $P_A(a) \triangleq \Pr\{A = a\}$ denotes
a probability mass function (PMF) and $P_{A|B}(a|b) \triangleq \Pr\{A = a | B = b\}$ a
conditional PMF.
Fig. 1. Dual-polarization CM transceiver with SD-FEC under consideration. The transmitter for each polarization consists of two cascaded binary FEC encoders
(outer and inner) followed by an MQAM mapper. The receiver is a BW receiver: L-values are calculated by the demapper (ignoring the intersymbol and interpolarization
interference), which is followed by an inner SD-FEC decoder and an outer HD-FEC decoder.


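by interpolarization interference, these effects are typically
ignored in current receivers, to reduce complexity. Hence,
each symbol in $\underline{Y}$ is decoded separately in both time and
polarization. More specifically, for each $l = 1, \ldots, n$ and
$p \in \{\mathsf{x}, \mathsf{y}\}$, soft information on the code bits
$C_{1,l}^p, \ldots, C_{m,l}^p$ is calculated in the form of L-values, also
known as logarithmic likelihood ratios, as

\begin{align}
L_{k,l}^p &\triangleq \log \frac{f_{Y_l^p|C_{k,l}^p}(y|1)}{f_{Y_l^p|C_{k,l}^p}(y|0)} \tag{2} \\
          &= L_{k,l}^{p,\mathrm{apo}} - L_{k,l}^{p,\mathrm{apri}} \tag{3}
\end{align}

where $k = 1, \ldots, m$ and

\begin{align}
L_{k,l}^{p,\mathrm{apo}} &\triangleq \log \frac{P_{C_{k,l}^p|Y_l^p}(1|y)}{P_{C_{k,l}^p|Y_l^p}(0|y)} \tag{4} \\
L_{k,l}^{p,\mathrm{apri}} &\triangleq \log \frac{P_{C_{k,l}^p}(1)}{P_{C_{k,l}^p}(0)} \tag{5}
\end{align}

are the a posteriori and a priori L-values, respectively.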

A stationary channel model is assumed, and thus, the index
l can be dropped. Furthermore, the performance in both
polarizations is expected to be identical, so from now on, the
notation $(\cdot)^p$ is also dropped. Using this and the law of total
probability in (2) gives

\[
L_k = \log \frac{\sum_{x \in \mathcal{X}_k^1} P_{X|C_k}(x|1) f_{Y|X}(y|x)}
                {\sum_{x \in \mathcal{X}_k^0} P_{X|C_k}(x|0) f_{Y|X}(y|x)}
\tag{6}
\]

where $\mathcal{X}_k^b \subset \mathcal{X}$ is the set of constellation symbols labeled by
a bit $b \in \mathcal{B} = \{0, 1\}$ at bit position $k \in \{1, \ldots, m\}$. The
sign of an L-value corresponds to an HD on the bit. Its magnitude
represents the reliability of the HD.


L-values calculated by the demapper are then passed to the
SD-FEC decoder. The SD-FEC decoder makes a decision on
the bits fed into the inner encoder. These bits are then used
by the outer HD-FEC decoder, as shown in Fig. 1.

To alleviate the computational complexity of (6), the well-
known max-log approximation [42]

\[
L_k \approx \log \frac{\max_{x \in \mathcal{X}_k^1} P_{X|C_k}(x|1) f_{Y|X}(y|x)}
                      {\max_{x \in \mathcal{X}_k^0} P_{X|C_k}(x|0) f_{Y|X}(y|x)}
\tag{7}
\]

is often used.


B. Pre-FEC BER 

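The lower branch of the receiver in Fig. 1 includes an
HD demapper which makes an HD on the code bits. We
assume that this HD demapper is the optimal memoryless HD
demapper in the sense of minimizing the pre-FEC BER. This
maximum a posteriori (MAP) decision rule is equivalent to
making an HD on the a posteriori L-values in (4): if $L_k^{\mathrm{apo}} > 0$
then $\hat{C}_k = 1$, and $\hat{C}_k = 0$ otherwise.^5 Formally,

\begin{align}
\mathrm{BER}_{\mathrm{pre}} &\triangleq \frac{1}{m} \sum_{k=1}^{m} \Pr\{\hat{C}_k \neq C_k\} \tag{8} \\
&= \frac{1}{m} \sum_{k=1}^{m} \sum_{c \in \mathcal{B}} P_{C_k}(c) \Pr\{\hat{C}_k \neq c \,|\, C_k = c\} \tag{9} \\
&= \frac{1}{m} \sum_{k=1}^{m} \sum_{c \in \mathcal{B}} P_{C_k}(c) \int_0^{\infty} f_{L_k^{\mathrm{apo}}|C_k}((-1)^c l \,|\, c) \, \mathrm{d}l. \tag{10}
\end{align}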
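As a minimal sketch of the hard-decision rule and of an empirical counterpart to (8), the snippet below decides 1 when an L-value is positive and 0 otherwise, then counts disagreements with the transmitted code bits. The function names and the toy numbers are illustrative assumptions, not taken from the paper.

```python
def hard_decisions(l_values):
    """Map each a posteriori L-value to a bit: 1 if L > 0, else 0."""
    return [1 if l > 0 else 0 for l in l_values]


def pre_fec_ber(l_values, code_bits):
    """Fraction of positions where the hard decision differs from the sent bit."""
    if len(l_values) != len(code_bits) or not code_bits:
        raise ValueError("need two equally long, non-empty sequences")
    decisions = hard_decisions(l_values)
    errors = sum(d != c for d, c in zip(decisions, code_bits))
    return errors / len(code_bits)


# Toy data, made up for illustration: 8 sent bits and their L-values.
sent = [1, 0, 1, 1, 0, 0, 1, 0]
llrs = [2.3, -1.1, -0.4, 5.0, 0.7, -3.2, 1.9, -0.1]

if __name__ == "__main__":
    print(pre_fec_ber(llrs, sent))  # two of the eight decisions are wrong: 0.25
```

In a simulation, `l_values` would hold the L-values of all m bit positions over all received symbols, so the average over k in (8) is taken implicitly.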

^5 This decision rule is slightly better than the standard demapper based
on HDs on the symbols followed by a symbol-to-bit mapper (inverting the
bit-to-symbol mapping used at the transmitter). However, the differences are
noticeable only at very high pre-FEC BER [45, Sec. V].
The pre-FEC BER is a standard performance measure for
uncoded systems. As discussed in Sec. II-D, pre-FEC BER
is a good predictor of post-FEC BER for HD-FEC with ideal
interleaving. We will show in Sec. IV that the pre-FEC BER
is not necessarily a good predictor of post-FEC BER for
SD-FEC.

C. SD-FEC 

We consider two families of binary SD-FEC: turbo codes
(TCs) and irregular repeat-accumulate LDPC codes. In both
cases, a pseudo-random bit-level interleaver is assumed to
be used prior to modulation (see Fig. 1). Without loss of
generality, we assume this interleaver to be part of the inner
FEC encoder.

The TCs we consider are formed as the parallel concatenation
of two identical, eight-state, recursive and systematic
convolutional encoders with code rate 1/2. The generator
polynomials are $(1, 11/15)_8$ [44] and the two encoders are
separated by an internal random interleaver, giving an
overall code rate $R_c = 1/3$. Six additional code
rates $R_c \in \{2/5, 1/2, 3/5, 2/3, 3/4, 5/6\}$ are obtained by
cyclically puncturing parity bits using the patterns defined
in na and [45], which leads to FEC overheads (OHs) of
{200, 150, 100, 66.6, 50, 33.3, 20}%. Each transmitted frame
consists of 20,000 information bits. The decoder is based on
the max-log-MAP decoding algorithm with ten iterations. The
extrinsic L-values exchanged during the iterations are scaled
by 0.7 as suggested in [46].

The LDPC codes we consider are those proposed by the
second generation satellite digital video broadcasting standard
[47] with code rates $R_c \in \{1/3, 2/5, 1/2, 3/5, 3/4, 9/10\}$.
This leads to OHs of {200, 150, 100, 66.66, 33.3, 11.1}%.
Each transmitted frame consists of 64,800 code bits. The
decoder uses the message passing algorithm with 50 iterations
and exact L-values.
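The overhead figures quoted for the TCs and the LDPC codes follow from the code rate through OH = (1 - R_c)/R_c. The short sketch below recomputes them; the function name and the rate lists are written out here for illustration only.

```python
from fractions import Fraction


def overhead_percent(code_rate):
    """FEC overhead in percent for a code of rate R: 100 * (1 - R) / R."""
    rate = Fraction(code_rate)
    if not 0 < rate <= 1:
        raise ValueError("code rate must lie in (0, 1]")
    return float(100 * (1 - rate) / rate)


# Code rates listed in the text for the two code families.
TC_RATES = ["1/3", "2/5", "1/2", "3/5", "2/3", "3/4", "5/6"]
LDPC_RATES = ["1/3", "2/5", "1/2", "3/5", "3/4", "9/10"]

if __name__ == "__main__":
    for name, rates in (("TC", TC_RATES), ("LDPC", LDPC_RATES)):
        print(name, [round(overhead_percent(r), 1) for r in rates])
```

Using `Fraction` keeps the rates exact, so a rate of 1/3 maps to an overhead of exactly 200%.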

What the SD-FEC encoder and decoder pair "sees" is a
binary-input soft-output (BISO) channel. This is shown in
Fig. 2(a). This BISO channel is sometimes known in the
literature as the BICM channel [48, Fig. 1] and it has been used
to predict the decoder performance via probabilistic models of
the L-values [2, Sec. 5.1]. In this paper, we are interested in
finding a measure to characterize this BISO channel in order
to predict the post-FEC BER across different channels.


D. HD-FEC

As shown in Fig. 1, the considered transceiver includes an
outer encoder to reduce the BER after SD-FEC decoding to
$10^{-15}$. For both TCs and LDPC codes, we use the staircase
code with 6.25% OH from [49, Table I]. For a BSC, this
staircase code guarantees an output BER of $10^{-15}$ for a
crossover probability of $4.7 \cdot 10^{-3}$. This corresponds to the
HD-FEC limit paradigm, which is perfectly justifiable under
the BSC assumptions.

To guarantee that the errors introduced by the inner SD-FEC
decoder are independent within a frame, we include a bit-
level interleaver (see Fig. 1). Under these assumptions, what
the HD-FEC encoder and decoder pair "sees" is a BSC with
crossover probability given by the BER after SD-FEC decoding
(BERpost). Therefore, the BER after HD-FEC decoding
can be assumed to be $10^{-15}$ for BERpost = $4.7 \cdot 10^{-3}$. This
is shown in Fig. 2(b). From now on, we therefore assume
the existence of the interleaver and staircase code, and thus,
without loss of generality, we focus on a target BER after
SD-FEC decoding of BERpost = $4.7 \cdot 10^{-3}$.

III. Achievable Rates

Achievable rates provide an upper bound on the number of
bits per symbol that can be reliably transmitted through the
channel. In this section we review achievable rates for channels
with memory, for optimal decoders, and for BW decoders.
These achievable rates will be used in Sec. IV to predict the
post-FEC BER.


A. Channels with Memory

A coding scheme consists of a codebook, an encoder, and
a decoder. The codebook is the set of codewords that can
be transmitted through the channel, where each codeword is
a sequence of symbols. The encoder is a one-to-one mapping
between the information sequences and codewords. The
decoder is a deterministic rule that maps the noisy channel
observations onto an information sequence.

A code rate, in bits per (single-polarization) symbol, is said
to be achievable at a given block length and for a given
average error probability $\varepsilon$ if there exists a coding scheme
whose average error probability is below $\varepsilon$. Under certain
assumptions on information stability [50, Sec. I], and for
any stationary random process $\{X_l\}$ with joint PDF $f_{\underline{X}}$, an
achievable rate for channels with memory (i.e., where symbols
are correlated in time and across polarizations) is given by

\[
R^{\mathrm{mem}} = \lim_{n \to \infty} \frac{1}{2n} I(\underline{X}; \underline{Y})
\tag{11}
\]

where $I(\underline{X}; \underline{Y})$ is the mutual information defined as

\[
I(\underline{X}; \underline{Y}) = \mathbb{E}_{\underline{X},\underline{Y}}\left[ \log_2 \frac{f_{\underline{Y}|\underline{X}}(\underline{Y}|\underline{X})}{f_{\underline{Y}}(\underline{Y})} \right]
\tag{12}
\]

and where $\mathbb{E}_{\underline{X},\underline{Y}}$ denotes the expectation with respect to both
$\underline{X}$ and $\underline{Y}$. The channel capacity is the largest achievable rate
for which a coding scheme with vanishing error probability
exists, in the limit of large block length.


B. Memoryless Receivers 

Although the discrete-time optical channel in Sec. II-A
suffers from intersymbol and interpolarization interference, the
standard receiver considered in this paper ignores these effects.
In particular, each polarization is considered independently
(see Fig. 1), and the soft information on the coded bits
is calculated ignoring correlation between symbols in time
(see (2)). To model these assumptions made by the receiver,
the channel is modeled by a conditional PDF $f_{Y|X}(y|x)$.
Therefore, from now on, and without loss of generality, only
one polarization is considered. Furthermore, we assume the
symbols are independent random variables drawn from a
distribution $f_X$.
Fig. 2. Interface for (a) the inner SD-FEC and (b) the outer HD-FEC in Fig. 1 (for one polarization). In (a), the inner FEC encoder and inner SD-FEC decoder
see a binary-input soft-output (BISO) channel characterized by the GMI. In (b), the outer FEC encoder and outer HD-FEC decoder see a binary symmetric channel
characterized by its crossover probability, given by the BER after SD-FEC decoding, BERpost.


An achievable rate for transceivers that ignore intersymbol
and interpolarization interference is

\[
I(X; Y) = \mathbb{E}_{X,Y}\left[ \log_2 \frac{f_{Y|X}(Y|X)}{f_Y(Y)} \right]
\tag{13}
\]

where $I(X; Y)$ is the unidimensional version of the MI in
(12). As expected, $R^{\mathrm{mem}} \geq I(X; Y)$ [51, Sec. III-F] and
thus, $I(X; Y)$ is a (possibly loose) lower bound on the
capacity of the channel with intersymbol and interpolarization
interference.

Let $\mathcal{C}$ be the binary codebook used for transmission and let
$\underline{c} \in \mathcal{C}$ denote a transmitted codeword, written as

\[
\underline{c} =
\begin{bmatrix}
c_{1,1} & c_{1,2} & \cdots & c_{1,n} \\
\vdots  & \vdots  & \ddots & \vdots  \\
c_{m,1} & c_{m,2} & \cdots & c_{m,n}
\end{bmatrix}.
\tag{14}
\]

Furthermore, let $\boldsymbol{B} = [B_1, \ldots, B_m]$ be a random vector
representing the transmitted bits $[c_{1,l}, \ldots, c_{m,l}]$ at any time
instant l, which are mapped to the corresponding symbol
$x_l \in \mathcal{X}$ with $l = 1, 2, \ldots, n$. Assuming a memoryless channel,
the optimal maximum-likelihood (ML) receiver chooses
the transmitted codeword based on an observed sequence
$[y_1, \ldots, y_n]$ according to the rule

\[
\underline{c}^{\mathrm{ml}} = \mathop{\mathrm{argmax}}_{\underline{c} \in \mathcal{C}} \sum_{l=1}^{n} \log f_{Y|\boldsymbol{B}}(y_l | c_{1,l}, \ldots, c_{m,l}).
\tag{15}
\]

Shannon's channel coding theorem states that reliable
transmission with the ML decoder in (15) is possible at arbitrarily
low error probability if the combined rate of the binary encoder
and mapper (in information bit/symbol) is below $I(X; Y)$, i.e.,
if $R_c m < I(X; Y)$.

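For a discrete constellation $\mathcal{X}$, the MI in (13) can be
expressed as

\[
I(X; Y) = \sum_{x \in \mathcal{X}} P_X(x) \int_{\mathbb{C}} f_{Y|X}(y|x) \log_2 \frac{f_{Y|X}(y|x)}{f_Y(y)} \, \mathrm{d}y.
\tag{16}
\]

A Monte-Carlo estimate thereof is

\[
I(X; Y) \approx \frac{1}{n} \sum_{x \in \mathcal{X}} P_X(x) \sum_{l=1}^{n} \log_2 \frac{f_{Y|X}(t^{(l)}|x)}{f_Y(t^{(l)})}
\tag{17}
\]

where $t^{(l)}$ with $l = 1, 2, \ldots, n$ are independent and identically
distributed (i.i.d.) random variables distributed according to the
channel law $f_{Y|X}(y|x)$.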


C. BW Receivers 

As shown in Fig. 1, the BW decoder considered in this paper
splits the decoding process. First, L-values are calculated, and
then, a binary SD decoder is used. More precisely, the BW
decoder rule is

\[
\underline{c}^{\mathrm{bw}} = \mathop{\mathrm{argmax}}_{\underline{c} \in \mathcal{C}} \sum_{l=1}^{n} \log \prod_{k=1}^{m} f_{Y|B_k}(y_l | c_{k,l}).
\tag{18}
\]

The BW decoding rule in (18) is not the same as the ML rule
in (15) and the MI is in general not an achievable rate with a
BW decoder.^6

The BW decoder can be cast into the framework of a
mismatched decoder by considering a symbol-wise metric

\[
q(\boldsymbol{b}, y) = \prod_{k=1}^{m} f_{Y|B_k}(y | b_k).
\tag{19}
\]

Using this mismatched decoding formulation, the BW rule in
(18) can be expressed as

\[
\underline{c}^{\mathrm{bw}} = \mathop{\mathrm{argmax}}_{\underline{c} \in \mathcal{C}} \sum_{l=1}^{n} \log q(\boldsymbol{b}_l, y_l)
\tag{20}
\]

where with a slight abuse of notation we use
$\boldsymbol{b}_l = [c_{1,l}, \ldots, c_{m,l}]$. Similarly, the ML decoder in (15) can be
seen as a mismatched decoder with a metric $q(\boldsymbol{b}_l, y_l) =
f_{Y|\boldsymbol{B}}(y_l | \boldsymbol{b}_l) = f_{Y|X}(y_l | x_l)$, which is "matched" to the
channel. Using this interpretation, the BW decoder uses metrics
matched to the bits, $f_{Y|B_k}(y | b_k)$, but not matched to the actual
(symbol-wise) channel.

An achievable rate for a BW decoder is the GMI, which
represents a bound on the number of bits per symbol that
can be reliably transmitted through the channel. The GMI is
defined as [52, eq. (59)–(60)], [2, eq. (4.34)–(4.35)]

\[
\mathrm{GMI} = \max_{s \geq 0} \mathbb{E}_{\boldsymbol{B},Y}\left[ \log_2 \frac{q(\boldsymbol{B}, Y)^s}{\sum_{\boldsymbol{b} \in \mathcal{B}^m} P_{\boldsymbol{B}}(\boldsymbol{b}) \, q(\boldsymbol{b}, Y)^s} \right].
\tag{21}
\]

^6 An exception is the trivial case of Gray-mapped 4QAM (i.e., quadrature
phase-shift keying, QPSK) with noise added in each quadrature independently.
In this case, the detection can be decomposed into two binary phase-shift
keying constellations, and thus, ML and BW decoders are identical.


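For the BW metric in (19) and assuming independent bits
$B_1, \ldots, B_m$, the GMI in (21) can be expressed as

\begin{align}
\mathrm{GMI} &= \max_{s \geq 0} \sum_{k=1}^{m} \mathbb{E}_{B_k,Y}\left[ \log_2 \frac{f_{Y|B_k}(Y|B_k)^s}{\sum_{b \in \mathcal{B}} P_{B_k}(b) f_{Y|B_k}(Y|b)^s} \right] \tag{22} \\
&= \sum_{k=1}^{m} \mathbb{E}_{B_k,Y}\left[ \log_2 \frac{f_{Y|B_k}(Y|B_k)}{\sum_{b \in \mathcal{B}} P_{B_k}(b) f_{Y|B_k}(Y|b)} \right] \tag{23} \\
&= \sum_{k=1}^{m} I(B_k; Y) \tag{24}
\end{align}

where (22) follows from [2, Theorem 4.11] and (23) from [2,
Corollary 4.12] (obtained with s = 1). The expression in (24)
follows from the definition of MI in (13).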

In general, $I(X; Y) \geq \mathrm{GMI}$ [2, Theorem 4.24],^7 where
the rate penalty $I(X; Y) - \mathrm{GMI}$ can be understood as the
penalty caused by the use of a suboptimal (BW) decoder. This
rate penalty, however, is known to be small for Gray-labeled
constellations E Fig. 4], [53], [54], [55, Sec. IV].

The GMI has not been proven to be the largest achievable
rate for the receiver in Fig. 1. For example, a different
achievable rate, the so-called LM rate, has been recently
studied in [56, Part I]. Moreover, in the case where unequally
likely constellation points are allowed, a new achievable rate
has been recently derived in [57, Theorem 1]. Finding the
largest achievable rate with a BW decoder remains an open
research problem. Despite this cautionary statement, the GMI
is known to predict well the performance of CM transceivers
based on capacity-approaching SD-FEC decoders. This will
be shown in Sec. IV.

When the L-values are calculated using (6), $I(B_k; Y) =
I(B_k; L_k)$ [2, Theorem 4.21], and thus, the GMI in (24)
becomes

\[
\mathrm{GMI} = \sum_{k=1}^{m} I(B_k; L_k)
\tag{25}
\]

i.e., the GMI is a sum of bit-wise MIs between code bits
and L-values. The equality in (25) does not hold, however, if
the L-values were calculated using the max-log approximation
(7), or more generally, if the L-values were calculated using
any other approximation. For example, when max-log L-values
are considered, it is possible to show that there is a loss in
achievable rates. Under certain conditions, this loss can be
recovered by adapting the max-log L-values, as shown in ll58l - 
iQ). 

Regardless of the L-value calculation, the GMI in (22) can
be estimated via Monte-Carlo integration as [2, Theorem 4.20]

\[
\mathrm{GMI} \approx \sum_{k=1}^{m} H_{\mathrm{b}}(P_{B_k}(0))
- \min_{s \geq 0} \frac{1}{n} \sum_{k=1}^{m} \sum_{b \in \mathcal{B}} P_{B_k}(b) \sum_{l=1}^{n} \log_2\left(1 + e^{s (-1)^b \lambda_{k,b}^{(l)}}\right)
\tag{26}
\]

where $\lambda_{k,b}^{(l)}$, $l = 1, 2, \ldots, n$ are i.i.d. random variables
distributed according to the PDF of the L-values $f_{L_k|B_k}(\lambda|b)$
and $H_{\mathrm{b}}(p) = -p \log_2(p) - (1 - p) \log_2(1 - p)$ is the binary
entropy function. The optimization over s in (26) can be
easily approximated (numerically) using the concavity of the
GMI on s [2, eq. (4.81)].

We emphasize here that the expression in (26) is valid for
any symbol-wise metric in the form of (19), i.e., for any L-
value $L_k$ that ignores the dependency between the bits in the
symbol. In particular, when the L-values are calculated exactly
using (6), the GMI can be estimated using (26) and s = 1,
which follows from [2, Theorem 4.20].

^7 The condition of i.i.d. bits in [2, Theorem 4.24] is not necessary; only
independence is needed.


D. AWGN Channel 


Often, if not always, CM transceivers in optical communication
systems assume that the discrete-time channel, including
transmitter- and receiver-side DSP, is a memoryless AWGN
channel $Y = X + Z$, where Z is a complex, zero-mean,
circularly symmetric Gaussian random variable with total
variance $\mathbb{E}[|Z|^2]$. This assumption might be suboptimal, but
in the absence of a better (non-Gaussian) model with memory,
the memoryless AWGN channel assumption is reasonable. In
this subsection, we specialize the MI and GMI estimators in
(17) and (26) to the AWGN channel and equally likely input
bits (and therefore, equally likely symbols in $\mathcal{X}$).

For the AWGN channel and a uniform input distribution,
the MI in (16) can be estimated using (17) as

\[
I(X; Y) \approx \log_2(M) - \frac{1}{Mn} \sum_{i=1}^{M} \sum_{l=1}^{n} \log_2 f_{i,l}
\tag{27}
\]

where

\[
f_{i,l} = \sum_{j=1}^{M} \exp\left(-\rho \left(2 \Re\{(x_i - x_j)^* z^{(l)}\} + |x_i - x_j|^2\right)\right),
\tag{28}
\]

the signal-to-noise ratio (SNR) $\rho$ is defined as $\rho =
\mathbb{E}_X[|X|^2] / \mathbb{E}[|Z|^2]$, and $z^{(l)}$ with $l = 1, 2, \ldots, n$ are n
independent realizations of the Gaussian random variable Z.
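A minimal sketch of the Monte-Carlo MI estimate for the AWGN channel, following the form of (27) and (28), is given below. It assumes a square QAM constellation scaled to unit average energy, so that the noise variance equals 1/ρ; the function names, sample size, and seed are illustrative choices rather than the authors' implementation.

```python
import math
import random


def square_qam(M):
    """Square M-QAM points scaled to unit average energy (an assumption)."""
    side = int(round(math.sqrt(M)))
    if side * side != M:
        raise ValueError("M must be a perfect square such as 4, 16, 64, 256")
    levels = [2 * i - (side - 1) for i in range(side)]
    points = [complex(a, b) for a in levels for b in levels]
    scale = math.sqrt(sum(abs(p) ** 2 for p in points) / M)
    return [p / scale for p in points]


def mi_awgn(points, snr_db, n=1000, seed=1):
    """Monte-Carlo MI estimate in bit/symbol for equally likely symbols."""
    rng = random.Random(seed)
    M = len(points)
    rho = 10 ** (snr_db / 10)            # SNR; with unit energy, E[|Z|^2] = 1/rho
    sigma = math.sqrt(1 / (2 * rho))     # noise std. dev. per real dimension
    acc = 0.0
    for _ in range(n):
        z = complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        for xi in points:
            # Exponents of the M terms summed in f_{i,l}.
            exps = [-rho * (2 * ((xi - xj).conjugate() * z).real
                            + abs(xi - xj) ** 2) for xj in points]
            top = max(exps)              # log-sum-exp for numerical stability
            log_f = top + math.log(sum(math.exp(e - top) for e in exps))
            acc += log_f / math.log(2)
    return math.log2(M) - acc / (M * n)


if __name__ == "__main__":
    for snr_db in (0, 10, 20):
        print(snr_db, "dB:", round(mi_awgn(square_qam(16), snr_db), 3))
```

Because the term with j = i always contributes exp(0) = 1, every f_{i,l} is at least 1 and the estimate never exceeds log2(M).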

L-values may be calculated either exactly or using the max-
log approximation. In the first case, the exact L-values in (6)
are calculated as

\[
L_k = \log \frac{\sum_{x \in \mathcal{X}_k^1} \exp(-\rho |y - x|^2)}{\sum_{x \in \mathcal{X}_k^0} \exp(-\rho |y - x|^2)}
\tag{29}
\]

where we used the uniform input symbol distribution assumption.
For given sequences of mn transmitted bits $c_{k,l}$ and
mn L-values $\lambda_{k,l}$ computed via (29), for $k = 1, \ldots, m$ and
$l = 1, \ldots, n$, the GMI in (26) can be estimated as

\[
\mathrm{GMI} \approx m - \frac{1}{n} \sum_{k=1}^{m} \sum_{l=1}^{n} \log_2\left(1 + e^{(-1)^{c_{k,l}} \lambda_{k,l}}\right).
\tag{30}
\]
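The exact L-values of (29) and the GMI estimate of (30) can be sketched as follows. The Gray labeling built from per-axis Gray-coded PAM, the unit-energy normalization, and all function names are assumptions made for this illustration; the paper does not prescribe an implementation.

```python
import math
import random


def gray_qam(M):
    """Gray-labeled square M-QAM with unit average energy.

    Returns a list of (point, bits); bits is a tuple of m = log2(M) bits.
    """
    side = int(round(math.sqrt(M)))
    half = int(round(math.log2(side)))
    if side * side != M or 2 ** half != side:
        raise ValueError("M must be 4, 16, 64, 256, ...")

    def gray_bits(i):
        g = i ^ (i >> 1)  # binary-reflected Gray code of the level index
        return tuple((g >> (half - 1 - k)) & 1 for k in range(half))

    raw = [(complex(2 * a - side + 1, 2 * b - side + 1), gray_bits(a) + gray_bits(b))
           for a in range(side) for b in range(side)]
    scale = math.sqrt(sum(abs(p) ** 2 for p, _ in raw) / M)
    return [(p / scale, bits) for p, bits in raw]


def log_sum_exp(values):
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def exact_l_values(y, constellation, rho):
    """Exact L-values for equally likely symbols on the AWGN channel."""
    out = []
    for k in range(len(constellation[0][1])):
        ones = [-rho * abs(y - p) ** 2 for p, b in constellation if b[k] == 1]
        zeros = [-rho * abs(y - p) ** 2 for p, b in constellation if b[k] == 0]
        out.append(log_sum_exp(ones) - log_sum_exp(zeros))
    return out


def log2_1p_exp(x):
    """Numerically stable log2(1 + e^x)."""
    if x > 0:
        return (x + math.log1p(math.exp(-x))) / math.log(2)
    return math.log1p(math.exp(x)) / math.log(2)


def gmi_exact(M, snr_db, n=2000, seed=1):
    """Monte-Carlo GMI estimate (bit/symbol) with exact L-values and s = 1."""
    rng = random.Random(seed)
    const = gray_qam(M)
    m = len(const[0][1])
    rho = 10 ** (snr_db / 10)
    sigma = math.sqrt(1 / (2 * rho))     # unit-energy symbols, variance 1/rho
    penalty = 0.0
    for _ in range(n):
        x, bits = rng.choice(const)
        y = x + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        for c, lam in zip(bits, exact_l_values(y, const, rho)):
            penalty += log2_1p_exp((-1) ** c * lam)
    return m - penalty / n


if __name__ == "__main__":
    for snr_db in (0, 10, 20):
        print(snr_db, "dB:", round(gmi_exact(16, snr_db), 3))
```

The log-sum-exp and `log2_1p_exp` helpers only guard against overflow and underflow at high SNR; they do not change the quantities being estimated.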












PREPRINT, 19/03/2015, 00:46 




In the second case, the max-log L-values in (7) are calculated
as

\[
L_k \approx \rho \left( \min_{x \in \mathcal{X}_k^0} |y - x|^2 - \min_{x \in \mathcal{X}_k^1} |y - x|^2 \right).
\tag{31}
\]

For given sequences of transmitted bits $c_{k,l}$ and max-log L-
values $\lambda_{k,l}$ computed via (31), the GMI can be estimated using
(26) as

\[
\mathrm{GMI} \approx m - \min_{s \geq 0} \frac{1}{n} \sum_{k=1}^{m} \sum_{l=1}^{n} \log_2\left(1 + e^{(-1)^{c_{k,l}} s \lambda_{k,l}}\right).
\tag{32}
\]

It is important to note at this point that to calculate the GMI,
(30) and (32) should be used for exact and max-log L-values,
respectively. Using (30) for max-log L-values results in a rate
lower than the true one, i.e., the minimization over s in (32)
is a mandatory step for approximated L-values.
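The role of the minimization over s can be illustrated with a small sketch that computes max-log L-values as in (31) and compares the estimate in (32) for s = 1 against the optimized value. The ternary search, the search interval [0, 4], the Gray labeling, and the unit-energy normalization are all assumptions made for this example.

```python
import math
import random


def gray_qam(M):
    """Gray-labeled square M-QAM with unit average energy: list of (point, bits)."""
    side = int(round(math.sqrt(M)))
    half = int(round(math.log2(side)))
    if side * side != M or 2 ** half != side:
        raise ValueError("M must be 4, 16, 64, 256, ...")

    def gray_bits(i):
        g = i ^ (i >> 1)
        return tuple((g >> (half - 1 - k)) & 1 for k in range(half))

    raw = [(complex(2 * a - side + 1, 2 * b - side + 1), gray_bits(a) + gray_bits(b))
           for a in range(side) for b in range(side)]
    scale = math.sqrt(sum(abs(p) ** 2 for p, _ in raw) / M)
    return [(p / scale, bits) for p, bits in raw]


def max_log_l_values(y, constellation, rho):
    """rho * (min squared distance to the 0-set minus that to the 1-set)."""
    out = []
    for k in range(len(constellation[0][1])):
        d0 = min(abs(y - p) ** 2 for p, b in constellation if b[k] == 0)
        d1 = min(abs(y - p) ** 2 for p, b in constellation if b[k] == 1)
        out.append(rho * (d0 - d1))
    return out


def log2_1p_exp(x):
    """Numerically stable log2(1 + e^x)."""
    if x > 0:
        return (x + math.log1p(math.exp(-x))) / math.log(2)
    return math.log1p(math.exp(x)) / math.log(2)


def simulate(M, snr_db, n=1000, seed=1):
    """Return m and a list of (code bit, max-log L-value) pairs for n symbols."""
    rng = random.Random(seed)
    const = gray_qam(M)
    rho = 10 ** (snr_db / 10)
    sigma = math.sqrt(1 / (2 * rho))
    pairs = []
    for _ in range(n):
        x, bits = rng.choice(const)
        y = x + complex(rng.gauss(0, sigma), rng.gauss(0, sigma))
        pairs.extend(zip(bits, max_log_l_values(y, const, rho)))
    return len(const[0][1]), pairs


def gmi_for_s(m, pairs, n, s):
    """m minus the empirical penalty for a fixed scaling s."""
    return m - sum(log2_1p_exp((-1) ** c * s * lam) for c, lam in pairs) / n


def gmi_max_log(m, pairs, n, s_max=4.0, iters=40):
    """Ternary search over s in [0, s_max]; the estimate is concave in s."""
    lo, hi = 0.0, s_max
    for _ in range(iters):
        a, b = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if gmi_for_s(m, pairs, n, a) > gmi_for_s(m, pairs, n, b):
            hi = b
        else:
            lo = a
    s_opt = (lo + hi) / 2
    return gmi_for_s(m, pairs, n, s_opt), s_opt


if __name__ == "__main__":
    m, pairs = simulate(16, 5.0)
    n = len(pairs) // m
    print("s = 1      :", round(gmi_for_s(m, pairs, n, 1.0), 4))
    print("optimized s:", gmi_max_log(m, pairs, n))
```

By construction the optimized estimate cannot fall below the s = 1 value on the same data, which mirrors the statement that skipping the minimization gives a rate lower than the true one.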

IV. Post-FEC BER Prediction 

In this section, we study the robustness of three different
metrics to predict the post-FEC BER of SD-FEC: the pre-FEC
BER, the MI, and the GMI. The aim is to find a robust
and easy-to-measure metric that can be used to predict the
post-FEC BER of a given encoder and decoder pair across
different channels. Results for the AWGN channel are shown
first, followed by results for the nonlinear optical channel.

A. AWGN Channel 

To study the post-FEC BER prediction across different
BISO channels (see Fig. 2(a)), we consider the TCs defined
in Sec. II-C and four modulation formats: M-QAM constellations
with M = 4, 8, 64, 256. For M = 4, 64, the SD decoder
uses exact L-values and for M = 8, 256, max-log L-values.^8
In Fig. 3(a), the post-FEC BER is shown as a function of
BERpre for the 24 cases. Ideally, all the lines for the same rate
(same color) should fall on top of one another, indicating that
measuring BERpre is enough to predict BERpost when the BISO
channel (in this case, the modulation format) changes. The
results in this figure show that this is not the case, especially
for low and medium code rates. The pre-FEC BER therefore
fails to predict the performance of the SD-FEC decoder across
different BISO channels.

To estimate the inaccuracy of the SD-FEC limit paradigm,
consider the results for 4QAM and $R_c = 1/3$ shown in
Fig. 3(a). For a target post-FEC BER of BERpost = $4.7 \cdot 10^{-3}$,
the required pre-FEC BER is BERpre ≈ 0.2. By using the
SD-FEC limit paradigm, we would conclude that to guarantee
the same post-FEC BER for 256QAM, the same pre-FEC
BER can be assumed (BERpre ≈ 0.2). This is clearly not the
case, as for 256QAM and $R_c = 1/3$, the pre-FEC BER can be
higher (BERpre ≈ 0.23). An alternative interpretation of this
is that the results in Fig. 3(a) show that for BERpre ≈ 0.2
and 256QAM, the code rate can be increased to $R_c = 2/5$.
This shows that the use of the SD-FEC limit paradigm
in this scenario leads to an underestimation of the spectral
^8 For M = 256, the use of max-log L-values is very relevant in practice as
the calculation in (29) is greatly simplified.



Fig. 3. Post-FEC BER for TCs with $R_c \in \{2/5, 1/2, 3/5, 2/3, 3/4, 5/6\}$
(colors) and different modulation formats (markers): 4QAM, 8QAM, 64QAM,
and 256QAM. The post-FEC BER is shown versus (a) pre-FEC BER, (b)
normalized MI, and (c) normalized GMI. The L-values for 8QAM and for
256QAM are calculated using the max-log approximation.
efficiency of 20%. Very similar conclusions can in fact be
drawn for the LDPC codes shown in [?4, Fig. 4]. We also
conjecture that the SD-FEC limit paradigm, as used in the
record results reported in [28], [29] (where a pre-FEC BER
threshold obtained for 4QAM was used for 2048QAM), is
in fact inaccurate and that even higher spectral efficiencies can be
obtained. In this case, however, we expect the underestimation
to be below 5%.
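The 20% figure follows directly from the ratio of the two code rates, since the spectral efficiency per polarization is m · Rc with m = log2(M). The following sketch reproduces this arithmetic for the 256QAM example; the function and variable names are illustrative and not part of the original analysis.

```python
from fractions import Fraction
import math


def spectral_efficiency(modulation_order, code_rate):
    """Information bits per symbol (per polarization): m * Rc with m = log2(M)."""
    bits_per_symbol = int(math.log2(modulation_order))
    return bits_per_symbol * Fraction(code_rate)


# Rate implied by reusing the 4QAM pre-FEC BER threshold for 256QAM.
assumed = spectral_efficiency(256, Fraction(1, 3))
# Rate that 256QAM actually supports at BERpre of about 0.2 according to Fig. 3(a).
possible = spectral_efficiency(256, Fraction(2, 5))
# Relative gain that the SD-FEC limit paradigm leaves unused.
relative_gain = possible / assumed - 1
```

Here `relative_gain` evaluates to 1/5, i.e., the 20% quoted in the text.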

The results in Fig. 3(a) show the variations in the required
pre-FEC BER to guarantee a given post-FEC BER across
different modulation formats. While for low code rates these
variations could lead to errors of up to 20% in spectral
efficiency, the errors decrease as the code rate increases. This
partially suggests that the pre-FEC BER is a relatively good
metric for high code rates; however, we have no theoretical
justification for the use of BERpre to predict the performance of
an SD-FEC decoder. Furthermore, we believe that having a metric that
works for all code rates is important. Considering only high
code rates, as is usually done in the optical community, is
an artificial constraint that reduces flexibility in the design, as
pointed out in [?, Sec. II-B].

An intuitive explanation for the results in Fig. 3(a) is that
the SD-FEC decoder in Fig. 1 does not operate on bits, and thus, a
metric that is based on bits (i.e., the pre-FEC BER) cannot be
used to predict the performance of the decoder. To clarify this,
we compare 8QAM and 64QAM for Rc = 1/3 and a target
BERpre ≈ 0.216. Exact L-value calculations are considered in
both cases. From Fig. 3(a) we see that BERpost ≈ 5 · 10^{-?} for
64QAM. For 8QAM, this value is BERpost ≈ 5 · 10^{-?}, which
is slightly lower than the one shown in Fig. 3(a) for max-log
L-values. In Fig. 4 we show the PDF⁹


\[
f_{L|B}(l|b) = \frac{1}{2m} \sum_{k=1}^{m} \left( f_{L_k|B_k}(l|b) + f_{L_k|B_k}(-l|1-b) \right). \tag{33}
\]


The PDF in (33) corresponds to the conditional PDF of
“symmetrized” and “mixed” L-values. For exact L-values, this
PDF has been recently shown in [?, Sec. V] to fully determine
the GMI (via GMI = m I(B; L)). Under the uniform bit
probability assumption, the pre-FEC BER in (10) can be
expressed as


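\[
\mathrm{BER}_{\mathrm{pre}} = \frac{1}{2m} \sum_{k=1}^{m} \int_{-\infty}^{0} \left( f_{L_k|B_k}(-l|0) + f_{L_k|B_k}(l|1) \right) \mathrm{d}l \tag{34}
\]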


and thus, it is clear that the pre-FEC BER can be calculated
by

\[
\mathrm{BER}_{\mathrm{pre}} = \int_{-\infty}^{0} f_{L|B}(l|1) \, \mathrm{d}l \tag{35}
\]

where f_{L|B}(l|b) is given by (33).
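The relation between the symmetrized L-values and the pre-FEC BER can be checked numerically. The sketch below assumes the convention that a positive L-value favors bit 1, and it uses BPSK over a real AWGN channel as toy data purely for illustration; this is not one of the channels studied here, and all names are hypothetical.

```python
import random


def symmetrize(bits, llrs):
    """Map each L-value to the 'bit 1 was sent' case: keep it if b = 1, negate it if b = 0."""
    return [llr if b == 1 else -llr for b, llr in zip(bits, llrs)]


def pre_fec_ber(bits, llrs):
    """Empirical counterpart of (35): fraction of symmetrized L-values below zero."""
    mixed = symmetrize(bits, llrs)
    return sum(1 for llr in mixed if llr < 0) / len(mixed)


# Toy data (assumption): BPSK over AWGN, where the exact L-value is 2*y/noise_var.
rng = random.Random(0)
noise_var = 0.5
bits = [rng.randint(0, 1) for _ in range(50_000)]
llrs = [2.0 * ((2 * b - 1) + rng.gauss(0.0, noise_var ** 0.5)) / noise_var for b in bits]

# BER from the symmetrized L-values versus BER from plain hard decisions.
ber_from_pdf = pre_fec_ber(bits, llrs)
ber_direct = sum((llr > 0) != (b == 1) for b, llr in zip(bits, llrs)) / len(bits)
```

Both estimates count the same error events, so they coincide. A histogram of `symmetrize(bits, llrs)` would give an empirical version of the PDF in (33) for b = 1.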

While both PDFs f_{L|B}(l|1) in Fig. 4 give the same pre-FEC
BER (BERpre ≈ 0.216), the post-FEC BER for 64QAM is
much lower than the one for 8QAM. This can be explained
by the different shapes of the PDFs in Fig. 4. In particular,
the slow-decaying right tail of the PDF of 64QAM shows that


⁹Estimated via histograms.



Fig. 4. Conditional PDF of the L-values f_{L|B}(l|1) in (33) for 8QAM and
64QAM. In both cases the L-values are calculated using (29).


some L-values with high reliability (i.e., high magnitude) will 
be observed, which the iterative SD-FEC decoder can exploit.

Using BERpre to predict the performance of SD-FEC decoders
has no information-theoretic justification. To remedy
this, one could consider the symbol-wise MI I(X;Y) (see
Fig. 1) as a metric to better predict BERpost. The values of
BERpost as a function of the normalized MI I(X;Y)/m are
shown in Fig. 3(b). In this case too, the prediction does not
work well across all rates. In particular, we note that although
for square QAM constellations (M = 4, 64, 256) the MI seems
to work well for high code rates (as previously reported in [34,
Sec. III]), this is not the case if 8QAM is considered. The MI
then appears to be less reliable for predicting BERpost than the
pre-FEC BER.

One intuitive explanation for the results in Fig. 3(b) is
that the MI is an achievable rate for the optimum receiver in
(?), but not for the (suboptimal) receiver in Fig. 1 (see (15)).
Another explanation is related to the dependence
of BERpost on the binary labeling of the constellation. It is
nowadays well understood that for the receiver in Fig. 1 the
performance of the SD-FEC decoder depends on the binary
labeling; Gray (or quasi-Gray) labelings are known to be
among the best. On the other hand, the MI does not depend
on the binary labeling but only on the constellation. Thus,
it is not surprising that a labeling-independent metric fails at
predicting the labeling-dependent BERpost.

The third and last metric we consider to predict BERpost
is the GMI. The rationale behind this is that an SD-FEC
decoder is fed with L-values, and thus, the GMI (see (25))
is an intuitively reasonable metric. The values of BERpost as
a function of the normalized GMI are shown in Fig. 3(c).
These results show that for a given code rate, changing the
constellation does not affect the post-FEC BER prediction
based on the GMI. More importantly, and unlike for BERpre
and MI, the prediction based on the GMI appears to work
across all code rates. These results in fact show that the
considered TCs appear to be universal (with respect to the
(b) Normalized GMI vs. code rate Rc

Fig. 5. Required values for the different metrics to give BERpost = 4.7 · 10^{-3}
as a function of the code rate for the same cases as in Fig. 3: (a) normalized
MI and (b) normalized GMI. The curves I(X;Y) = mRc and GMI = mRc
are shown in (a) and (b), respectively.


GMI), which, to the best of our knowledge, has never been 
shown in the literature. 
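Because the GMI depends only on the bits and their L-values, it can be estimated without running a decoder. The sketch below uses a common Monte Carlo estimator, GMI ≈ m − (1/N) Σ_n Σ_k log2(1 + exp((−1)^b · l)), which assumes equiprobable bits and L-values defined as log(P(b = 1)/P(b = 0)). Whether this matches the exact form of (25) is an assumption, and the BPSK test channel and all names are hypothetical.

```python
import math
import random


def gmi_estimate(bits, llrs):
    """Monte Carlo GMI estimate in bits per symbol.

    bits[n][k] and llrs[n][k] refer to bit position k of symbol n,
    with llr = log(P(b = 1 | y) / P(b = 0 | y)) and equiprobable bits.
    """
    num_symbols, m = len(bits), len(bits[0])
    penalty = 0.0
    for sym_bits, sym_llrs in zip(bits, llrs):
        for b, llr in zip(sym_bits, sym_llrs):
            signed = -llr if b == 1 else llr  # (-1)^b * llr
            # log2(1 + exp(signed)), written to avoid overflow for large |signed|
            penalty += (max(signed, 0.0) + math.log1p(math.exp(-abs(signed)))) / math.log(2)
    return m - penalty / num_symbols


def toy_bpsk(noise_var, num_symbols=20_000, seed=1):
    """Hypothetical test channel: BPSK over AWGN with exact L-values (m = 1)."""
    rng = random.Random(seed)
    bits = [[rng.randint(0, 1)] for _ in range(num_symbols)]
    llrs = [[2.0 * ((2 * b[0] - 1) + rng.gauss(0.0, math.sqrt(noise_var))) / noise_var]
            for b in bits]
    return bits, llrs
```

For example, `gmi_estimate(*toy_bpsk(0.05))` is close to 1 bit/symbol, while a noisier channel such as `toy_bpsk(2.0)` gives a much smaller value. Dividing the estimate by m yields the normalized GMI used on the horizontal axis of Fig. 3(c).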

Fig. 5 shows the values of MI and GMI needed for each
configuration in Fig. 3 to reach a post-FEC BER of BERpost =
4.7 · 10^{-3}.¹⁰ These values are obtained by finding the crossing
points of the curves in Fig. 3 and the horizontal dashed
lines. Fig. 5 also shows the relationships I(X;Y) = mRc
and GMI = mRc, where the vertical difference between the
markers and the solid lines represents the rate penalty for
these codes. The results in Fig. 5 clearly show the excellent
prediction based on GMI and how MI does not work well
across different modulation formats.
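Finding the crossing point of a simulated BER curve with the target level can be sketched as follows. The interpolation rule (linear in log10 of the BER between the two bracketing samples) and the sample values in the usage note are assumptions made for illustration; the text does not state how the crossing points were computed.

```python
import math


def required_metric(metric, ber_post, target=4.7e-3):
    """Metric value at which a post-FEC BER curve crosses the target BER.

    Interpolates linearly in log10(BER) between the two samples that bracket
    the target. All BER samples must be positive. Returns None if the curve
    never crosses the target.
    """
    points = sorted(zip(metric, ber_post))
    log_target = math.log10(target)
    for (x0, b0), (x1, b1) in zip(points, points[1:]):
        hi, lo = math.log10(b0), math.log10(b1)
        if hi >= log_target > lo:
            return x0 + (hi - log_target) / (hi - lo) * (x1 - x0)
    return None
```

With the hypothetical samples `required_metric([0.60, 0.62, 0.64, 0.66], [1e-1, 1e-2, 1e-3, 1e-5])`, the crossing lies between 0.62 and 0.64. The gap between such a value and the code rate Rc is the rate penalty mentioned above.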

We showed before that the pre-FEC BER can lead to
an erroneous estimate of the spectral efficiency, which is
particularly noticeable for low code rates. A similar problem
occurs if the normalized MI in Fig. 5(a) is used to predict
post-FEC BER. For example, the results in Fig. 5(a) show that a
post-FEC BER of BERpost = 4.7 · 10^{-3} can be achieved with 4QAM
and Rc = 2/3 when the normalized MI is approximately 0.71
(see also Fig. 3(b)). One might be tempted to then conclude
that, for the same MI, the same post-FEC BER can be achieved
with 8QAM and Rc = 2/3. The results in Fig. 5(a) show
that this is in fact not possible, and a (lower) code rate of
Rc = 3/5 is needed. In other words, the use of an “MI threshold
paradigm” could lead to an overestimation (in this case by
11%) of the true spectral efficiency. This is not the case for
the GMI (see Fig. 5(b)), where all markers for the same code
fall on top of one another.

¹⁰Note that similar results could be presented in terms of pre-FEC BER.
To have a fair comparison in terms of rates, however, one would need to
convert the pre-FEC BER into SNR, and then map that SNR onto MI (or
GMI), giving exactly what is shown in Fig. 3.



Fig. 6. Achievable rates (per polarization) versus span length: GMI (solid
lines) and LDPC codes with Rc ∈ {1/3, 1/2, 3/4, 9/10} (markers).


B. Optical Channel—Simulations 

Dual-polarization transmission over the nonlinear optical
channel specified in Sec. III-A was simulated using the coupled
polarization nonlinear Schrödinger equation (NLSE) [?,
eq. (6)]. This enabled the consideration of an idealized transmission
link with zero polarization mode dispersion. The
simulations were carried out via the split-step Fourier method
with a step size of 100 m and an oversampling factor of
4 samples/symbol.
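The split-step Fourier method alternates between a linear step (dispersion and attenuation, applied in the frequency domain) and a nonlinear step (Kerr phase rotation, applied in the time domain). The sketch below is a deliberately simplified scalar, single-polarization version with a symmetric splitting; it is not the coupled-polarization solver used for the results in this section. The default fibre parameters are typical values for standard single-mode fibre and are assumptions, as is the function name `ssfm`.

```python
import numpy as np


def ssfm(field, dt, length_m, step_m=100.0, alpha=0.0, beta2=-21.7e-27, gamma=1.3e-3):
    """Propagate a sampled optical field over one fibre span (symmetric split-step).

    field: complex samples with |field|^2 in watts; dt: sample spacing in seconds.
    alpha: power attenuation in 1/m; beta2 in s^2/m; gamma in 1/(W m).
    """
    field = np.asarray(field, dtype=complex)
    omega = 2.0 * np.pi * np.fft.fftfreq(field.size, d=dt)
    num_steps = int(round(length_m / step_m))
    # Linear operator for half a step: dispersion plus attenuation.
    half_linear = np.exp((0.5j * beta2 * omega**2 - 0.5 * alpha) * step_m / 2.0)
    for _ in range(num_steps):
        field = np.fft.ifft(half_linear * np.fft.fft(field))
        # Nonlinear operator for a full step: power-dependent phase rotation.
        field = field * np.exp(1j * gamma * np.abs(field) ** 2 * step_m)
        field = np.fft.ifft(half_linear * np.fft.fft(field))
    return field
```

With `alpha=0` the total energy of the field is conserved, which provides a simple sanity check of an implementation. A dual-polarization solver would propagate two coupled field components in the same alternating fashion.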

Fig. 6 shows the GMI (per polarization) as a function
of the span length, for M-QAM constellations with M =
4, 16, 64, 256. For each distance and M, we used the launch
power that gave the highest GMI. In this figure, we also
show the distance required by the LDPC codes in Sec. III-A
to give BERpost = 4.7 · 10^{-3} for each combination of the four
constellations and Rc ∈ {1/3, 1/2, 3/4, 9/10}. The vertical
positions of these 16 markers represent the resulting achievable
rates and clearly show that the results follow the GMI curves.
This is in good agreement with the results in [?], [?],
where it was shown that the GMI can be used to predict
the performance of LDPC codes for the AWGN channel. The
penalties with respect to the GMI are between 5 and 15 km
and are highest for high code rates and large values of M.
These penalties are caused by the suboptimality of the LDPC
code under consideration.

In analogy with Fig. 3, Fig. 7 shows the post-FEC BER
as a function of (a) pre-FEC BER, (b) normalized MI, and
(c) normalized GMI. The results for the NLSE are shown
with filled markers and show that the prediction based on
the GMI is excellent. Just as for TCs, the prediction based
on pre-FEC BER does not always work; however, a relatively
good approximation is obtained for high code rates.

When compared to the results in [15, Fig. 5], we note
that the curves in Fig. 7(c) are more “compact” for low
rates. The difference between the simulation setup in [15]
and the one in this paper is that here we consider a random
interleaver between the binary encoder and the mapper. Using
(a) Post-FEC BER vs. pre-FEC BER

(b) Horizontal axis: GMI/m

Fig. 7. Post-FEC BER for LDPC codes with Rc ∈ {1/3, 1/2, 3/4, 9/10}
(colors) over linear and nonlinear channels and different modulation formats


Fig. 8. (a) Pre-FEC BER and post-FEC BER as a function of the launch
power for an LDPC code with Rc = 3/4, 64QAM, and L = 210 km. (b)
Pre-FEC BER vs. normalized GMI in the linear (white markers) and nonlinear
(filled markers) regimes.


this interleaver is thus important to make the GMI-based 
prediction even more precise. 

In Fig. 7 we also show results obtained for the AWGN
channel (white markers). These results were obtained for the
same modulation and coding pairs as used in the NLSE
simulations and show that indeed the GMI is a robust metric to
predict post-FEC BER across different channels. In particular,
Fig. 7(c) shows that the post-FEC BER predictions give the
same results for both the AWGN channel and the simulations
based on the NLSE. This also suggests that using a Gaussian
model for the noise is quite reasonable.

All the results in Fig. 7(c) for the NLSE were obtained for
the optimal launch power. To show that the GMI prediction
is also not dependent on the launch power, we study a fixed


(markers): 4QAM, 16QAM, 64QAM, and 256QAM. The post-FEC BER is 
shown versus (a) pre-FEC BER, (b) normalized MI, and (c) normalized GMI. 
All the L-values are calculated using (29).


distance and vary the launch power, bringing the system deep
into the nonlinear regime. As the modulation format, we
choose 64QAM and, based on the results in Fig. 6, we use
Fig. 9. 64QAM Nyquist spaced WDM transmission testbed.


L = 210 km and Rc = 3/4. The launch power was varied
from 2.6 dBm to 12.6 dBm, giving the pre-FEC and post-FEC
BER shown in Fig. 8(a). The same post-FEC BER values are
shown in Fig. 8(b) as a function of the normalized GMI. This
figure shows once again that the GMI can be used to accurately
predict the post-FEC BER of SD-FEC decoders, even when
the channel is highly nonlinear.


C. Optical Channel—Experiments 

To experimentally verify that the normalized GMI is an accurate
predictor for post-FEC BER, the LDPC code described
in Sec. III-D was implemented in a dual-polarization 64QAM
Nyquist-spaced WDM transmission system. The corresponding
experimental setup is illustrated in Fig. 9. A 100 kHz
linewidth external cavity laser (ECL) was passed through an
optical comb generator (OCG) to obtain seven frequency-locked
comb lines with a channel spacing of 10.01 GHz.
The eight-level drive signals required for 64QAM were generated
offline in Matlab and were digitally filtered using an
RRC filter with a roll-off factor of 0.1%. The resulting in-phase
(I) and quadrature (Q) signals were loaded onto a
pair of field-programmable gate arrays (FPGAs) and output
using two digital-to-analog converters (DACs) operating at
20 Gsamples/s (2 samples/symbol). The odd and even sub-carriers
were independently modulated using two complex
IQ modulators, which were subsequently decorrelated before
being combined and polarization multiplexed to form a
Nyquist spaced 64QAM super-carrier. The recirculating loop
configuration consisted of two acousto-optic switches (AOS),
two EDFAs with a noise figure of 4.5 dB, an optical band-pass
filter (BPF) for amplified spontaneous emission noise removal,
a loop-synchronous polarization scrambler (PS), and a single
81.8 km span of Corning® SMF-28® ULL fiber.

The polarization-diverse coherent receiver had an electrical 
bandwidth of 70 GHz and used a second 100 kHz linewidth 
ECL as a local oscillator (LO). The frequency of the LO was 
set to coincide with the central sub-carrier of the 64QAM 
super-carrier and the received signals were captured using a 
160 Gsamples/s real-time sampling oscilloscope with 63 GHz 
analog electrical bandwidth. DSP and SD-FEC decoding were
subsequently performed offline in Matlab and were identical to
those described in [?].



Fig. 10. Post-FEC BER for the 64QAM Nyquist spaced WDM transmission
testbed with LDPC codes and Rc ∈ {2/5, 1/2, 3/5, 3/4, 9/10} (colors) as
a function of the normalized GMI. Experimental results for different numbers
of spans Ns are shown with markers and AWGN results with solid lines.


The transmission performance of the central WDM carrier
was analyzed over a number of transmission distances
from 81.8 km (Ns = 1) to 1308.8 km (Ns = 16) and
for a number of launch powers, ranging from -18 dBm to
+2 dBm. This resulted in a normalized GMI ranging from
0.39 to 0.93, which required adaptation of the OH in order to
achieve a post-FEC BER below the target BER after
SD-FEC decoding, BERpost = 4.7 · 10^{-3}. Fig. 10 shows
the experimentally measured post-FEC BER (markers) as
a function of the normalized GMI, for five code rates Rc ∈
{2/5, 1/2, 3/5, 3/4, 9/10}. The transmission distances were
81.8, 327.2, 654.4, and 1308.8 km, i.e., Ns = 1, 4, 8, and 16
spans. The simulated results obtained for an AWGN channel
(i.e., the ones in Fig. 7(c)) are displayed using solid lines.
Excellent agreement between the simulated curves and the
experimental points is demonstrated for all SD-FEC code rates,
launch powers, and distances, even though the simulations and
experiments concern entirely different channels.¹¹

Each result shown with a marker in Fig. 10 corresponds
to a given launch power (per channel), code rate Rc, and
number of spans Ns. These results are summarized in Table II,
where the launch powers are also shown. The results in Fig. 10
and Table II show that regardless of the transmit power, the
normalized GMI can indeed be used to predict the post-FEC
BER. These results can be seen as experimental validation of
those presented in Fig. 0 

V. Conclusions 

This paper studied the GMI as a powerful tool to predict the 
post-FEC BER of soft-decision FEC. The GMI was measured
in experiments and simulations, and for all the considered 
scenarios proved to be very robust. The GMI can be used 

¹¹Note also that the parameters of the experimental setup in this section
are different from those in Sec. IV-B.
TABLE II

Summary of results for the experimental setup in Fig. 9. Each
row corresponds to a marker in Fig. 10.

Launch Power    GMI/m    BERpost          Spans Ns    Rate Rc
-18.27 dBm      0.39     1.7 · 10^{-?}    16          2/5
-17.00 dBm      0.44     5.0 · 10^{-2}    16
-17.00 dBm      0.44     1.7 · 10^{-?}    16
-15.90 dBm      0.50     1.3 · 10^{-1}    16          1/2
-14.80 dBm      0.55     4.9 · 10^{-2}    16
-13.69 dBm      0.56     1.5 · 10^{-?}    16
-18.12 dBm      0.53     1.5 · 10^{-2}    8
-17.14 dBm      0.57     1.3 · 10^{-2}    8           3/5
-15.94 dBm      0.61     9.9 · 10^{-3}    8
-18.21 dBm      0.64     1.3 · 10^{-5}    4
-18.20 dBm      0.66     1.0 · 10^{-1}    4
-17.20 dBm      0.69     8.8 · 10^{-2}    4
-16.01 dBm      0.73     6.4 · 10^{-2}    4           3/4
-15.0 dBm       0.77     3.3 · 10^{-2}    4
-0.60 dBm       0.78     2.7 · 10^{-4}    4
-9.17 dBm       0.87     2.9 · 10^{-2}    1
-10.17 dBm      0.89     2.2 · 10^{-2}    1
-11.24 dBm      0.91     1.2 · 10^{-2}    1           9/10
-12.24 dBm      0.92     8.2 · 10^{-4}    1
-13.30 dBm      0.93     3.4 · 10^{-5}    1



to predict the post-FEC BER without actually encoding and 
decoding data. 

The pre-FEC BER and MI were also shown to be weak
predictors of the performance of soft-decision FEC for bit-wise
decoders. The so-called FEC limit is, hence, an unreliable
design criterion for optical communication systems with soft-decision
FEC. On the other hand, the GMI was found to give
very good results for all code rates, all considered modulation
formats, LDPC and turbo codes, exact and approximated L-values,
and for both linear and nonlinear optical transmission.
We suggest replacing the “SD-FEC limit” (used for many
years with hard-decision decoding and now becoming increasingly
popular with soft decision) with a “GMI limit”, which
is relevant for modern optical communication systems.
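As an illustration of how a “GMI limit” could be applied in a design flow, the sketch below selects the highest code rate whose GMI threshold is met by a measured normalized GMI. The threshold values are placeholders chosen for illustration only and are not results from this paper; in practice they would be read from curves such as those in Fig. 5(b). All names are hypothetical.

```python
from fractions import Fraction


def highest_supported_rate(normalized_gmi, thresholds):
    """Pick the largest code rate whose required normalized GMI is met.

    thresholds maps code rate -> normalized GMI needed to reach the target
    post-FEC BER. Returns None if no rate is supported.
    """
    supported = [rate for rate, needed in thresholds.items() if normalized_gmi >= needed]
    return max(supported) if supported else None


# Placeholder thresholds for illustration only (not values from the paper).
EXAMPLE_THRESHOLDS = {
    Fraction(2, 5): 0.45,
    Fraction(1, 2): 0.55,
    Fraction(3, 5): 0.65,
    Fraction(3, 4): 0.80,
    Fraction(9, 10): 0.93,
}
```

With these placeholder values, a measured normalized GMI of 0.70 would select Rc = 3/5. Because the GMI thresholds do not depend on the modulation format, the same table could be reused across formats, which is the practical advantage over a pre-FEC BER threshold.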

This paper considered only noniterative binary decoding.
Different results are expected if a (capacity-approaching soft-decision)
nonbinary decoder or a binary decoder with iterative
detection (i.e., with soft information being exchanged iteratively
between the decoder and demapper) is used. In these
cases, we conjecture the MI to be the correct metric to predict
the post-FEC BER. This comparison is left for future work.